import numpy as np
import scipy.linalg as sla
import numpy.linalg as la
import random
import matplotlib.pyplot as plt
%matplotlib inline
We want to prepare a matrix with deliberately chosen eigenvalues. Let's use diagonalization to write the matrix A:
$$A = U D U^{-1}$$
where we set D to be a known matrix with the pre-defined eigenvalues:
D = np.diag([lambda1, lambda2, ..., lambdan])
We need to generate a matrix U that has an inverse. Orthogonal matrices are a great option here, since $U^{-1} = U^T$. We use QR decomposition to get an orthogonal matrix (you don't need to understand this method).
n = 4
X = np.random.rand(n,n)
U,_ = sla.qr(X)
D = np.diag([6,2,4,7])
Now we can use diagonalization to write A
A = U@D@U.T
And we can check that the eigenvalues are indeed what we expected:
eigl, eigv = la.eig(A)
print(eigl)
print(eigv)
We want to find the eigenvector corresponding to the largest eigenvalue in magnitude. For that, we can use np.argsort, which returns the indices that sort the array in ascending order. Hence, we are interested in the last entry.
eig_index_sort = np.argsort(abs(eigl))
print(eig_index_sort)
eigpos = eig_index_sort[-1]
Recall that eigenvectors are stored as columns! Hence this would be the eigenvector corresponding to the largest (in magnitude) eigenvalue.
eigv[:,eigpos]
Let's also pick an initial vector:
x0 = np.random.randn(n)
x0
Power iteration should converge to a multiple of the eigenvector $u_1$ corresponding to the largest eigenvalue (in magnitude):
$$x_k = (\lambda_1)^k\left[\alpha_1 u_1 + \alpha_2 \left(\frac{\lambda_2}{\lambda_1}\right)^k u_2 + \dots\right]$$
Let's implement power iteration. We simply perform repeated matrix-vector multiplications using a for loop:
x = x0
for i in range(40):
    x = A @ x
print(x)
print('Exact eigenvector = ', eigv[:,eigpos])
We can get the corresponding eigenvalue from the Rayleigh quotient:
np.dot(x,A@x)/np.dot(x,x)
Back to the beginning: Reset to the initial vector and normalize
x = x0/la.norm(x0)
Implement normalized power iteration. We will start with 10 iterations, and see what happens...
x = x0/la.norm(x0)
for i in range(10):
    x = A @ x
    nrm = la.norm(x)
    x = x/nrm
print(x)
print('exact = ' ,eigv[:,eigpos])
print('eig_approx = ', np.dot(x,A@x)/np.dot(x,x))
$$x_k = (\lambda_1)^k \alpha_1 u_1 + (\lambda_1)^k\left(\frac{\lambda_2}{\lambda_1}\right)^k \alpha_2 u_2 + (\lambda_1)^k\left[\left(\frac{\lambda_3}{\lambda_1}\right)^k \alpha_3 u_3 + \dots\right]$$
In theory (i.e., with infinite precision arithmetic), if $\alpha_1 = 0$, power iteration will converge to a vector that is a multiple of the eigenvector $u_2$.
In practice, it is unlikely that a random vector $x_0$ will have no component along $u_1$. And even if that happens, rounding errors during the iterative process will usually introduce such a component.
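As a small optional sketch, we can remove the $u_1$ component from a starting vector and watch roundoff bring it back (this uses its own test matrix with a large eigenvalue gap so the effect shows up quickly):
# Optional sketch: start with (numerically) no u1 component and let roundoff reintroduce it
ns = 4
Xs = np.random.rand(ns, ns)
Us,_ = sla.qr(Xs)
As = Us @ np.diag([10., 1., 0.5, 0.1]) @ Us.T
u1 = Us[:,0]                  # eigenvector of the dominant eigenvalue 10
y = np.random.randn(ns)
y = y - (u1 @ y)*u1           # project out the u1 component (alpha_1 ~ 0)
y = y/la.norm(y)
for i in range(60):
    y = As @ y
    y = y/la.norm(y)
# |<y, u1>| should end up close to 1: roundoff introduces a tiny u1 component,
# which then grows relative to the others by a factor of about 10 per iteration
print(abs(u1 @ y))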
n = 4
D = np.diag([-5,2,4,3])
A = U@D@U.T
eigl, eigv = la.eig(A)
eig_index_sort = np.argsort(abs(eigl))
eigpos = eig_index_sort[-1]
print(eigl)
print(eigv[:,eigpos])
x = x0/la.norm(x0)
for i in range(40):
    x = A @ x
    nrm = la.norm(x)
    x = x/nrm
print(x)
print('exact = ' ,eigv[:,eigpos])
print('eig_approx = ', np.dot(x,A@x)/np.dot(x,x))
print(D)
What is happening here? Note that the scalar that multiplies the eigenvector $u_1$ in
$$x_k = (\lambda_1)^k \alpha_1 u_1 + (\lambda_1)^k\left(\frac{\lambda_2}{\lambda_1}\right)^k \alpha_2 u_2 + (\lambda_1)^k\left[\left(\frac{\lambda_3}{\lambda_1}\right)^k \alpha_3 u_3 + \dots\right]$$
is $(\lambda_1)^k$, and hence if the eigenvalue $\lambda_1$ is negative, power iteration will converge to the eigenvector direction, but with alternating signs, i.e., the iterates flip between $u_1$ and $-u_1$.
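To see this explicitly, we can track the projection of the normalized iterate onto $u_1$ (a small check, reusing A and eigv from the cells above); its sign should flip at every iteration:
x = x0/la.norm(x0)
for i in range(8):
    x = A @ x
    x = x/la.norm(x)
    # this projection is proportional to (-5)^k, so its sign alternates
    print(i, np.dot(x, eigv[:,eigpos]))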
n = 4
D = np.diag([5,5,2,1])
A = U@D@U.T
eigl, eigv = la.eig(A)
print(eigl)
print(eigv[:,2])
print(eigv[:,3])
x = x0/la.norm(x0)
for i in range(40):
    x = A @ x
    nrm = la.norm(x)
    x = x/nrm
print(x)
print('u1_exact = ' ,eigv[:,2])
print('u2_exact = ' ,eigv[:,3])
print('eig_approx = ', np.dot(x,A@x)/np.dot(x,x))
print(D)
In general, the power method iterate has the form
$$x_k = (\lambda_1)^k \alpha_1 u_1 + (\lambda_1)^k\left(\frac{\lambda_2}{\lambda_1}\right)^k \alpha_2 u_2 + (\lambda_1)^k\left[\left(\frac{\lambda_3}{\lambda_1}\right)^k \alpha_3 u_3 + \dots\right]$$
However, if $|\lambda_1| = |\lambda_2|$ with $\lambda_1, \lambda_2 > 0$, we get
$$x_k = (\lambda_1)^k(\alpha_1 u_1 + \alpha_2 u_2) + [\dots]$$
and hence power iteration converges to a multiple of a linear combination of the eigenvectors $u_1$ and $u_2$.
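A quick way to confirm this: any vector in the eigenspace of the eigenvalue 5 satisfies $Ax = 5x$, so the residual below should be tiny even though x does not match a single column of eigv.
# x from the loop above should (approximately) satisfy A x = 5 x,
# i.e., it lies in the two-dimensional eigenspace of the eigenvalue 5
print(la.norm(A@x - 5*x))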
n = 4
D = np.diag([-5,-5,2,1])
A = U@D@U.T
eigl, eigv = la.eig(A)
print(eigl)
print(eigv[:,2])
print(eigv[:,3])
x = x0/la.norm(x0)
for i in range(40):
    x = A @ x
    nrm = la.norm(x)
    x = x/nrm
print(x)
print('u1_exact = ' ,eigv[:,2])
print('u2_exact = ' ,eigv[:,3])
print('eig_approx = ', np.dot(x,A@x)/np.dot(x,x))
print(D)
In general, the power method iterate has the form
$$x_k = (\lambda_1)^k \alpha_1 u_1 + (\lambda_1)^k\left(\frac{\lambda_2}{\lambda_1}\right)^k \alpha_2 u_2 + (\lambda_1)^k\left[\left(\frac{\lambda_3}{\lambda_1}\right)^k \alpha_3 u_3 + \dots\right]$$
However, if $|\lambda_1| = |\lambda_2|$ with $\lambda_1, \lambda_2 < 0$, we get
$$x_k = \pm|\lambda_1|^k(\alpha_1 u_1 + \alpha_2 u_2) + [\dots]$$
and hence power iteration converges to a multiple of a linear combination of the eigenvectors $u_1$ and $u_2$, but the sign flips at each step of the iterative method.
n = 4
D = np.diag([-5,5,2,1])
A = U@D@U.T
eigl, eigv = la.eig(A)
print(eigl)
print(eigv[:,0])
print(eigv[:,1])
x = x0/la.norm(x0)
for i in range(40):
    x = A @ x
    nrm = la.norm(x)
    x = x/nrm
print(x)
print('u1_exact = ', eigv[:,0])
print('u2_exact = ', eigv[:,1])
print('eig_approx = ', np.dot(x,A@x)/np.dot(x,x))
print(D)
In general, the power method iterate has the form
$$x_k = (\lambda_1)^k \alpha_1 u_1 + (\lambda_1)^k\left(\frac{\lambda_2}{\lambda_1}\right)^k \alpha_2 u_2 + (\lambda_1)^k\left[\left(\frac{\lambda_3}{\lambda_1}\right)^k \alpha_3 u_3 + \dots\right]$$
However, if $|\lambda_1| = |\lambda_2|$ and $\lambda_1, \lambda_2$ have opposite signs, we get
$$x_k = \pm|\lambda_1|^k(\alpha_1 u_1 \pm \alpha_2 u_2) + [\dots]$$
and hence power iteration does not converge to one solution. Indeed, the method oscillates between two different linear combinations of the eigenvectors, and fails to give the correct eigenvalue.
We want to approximate the eigenvalue $\lambda_1$ using the iterates of power iteration
$$x_k = (\lambda_1)^k \alpha_1 u_1 + (\lambda_1)^k\left(\frac{\lambda_2}{\lambda_1}\right)^k \alpha_2 u_2 + (\lambda_1)^k\left[\left(\frac{\lambda_3}{\lambda_1}\right)^k \alpha_3 u_3 + \dots\right]$$
$x_k$ approaches a multiple of the eigenvector $u_1$ as $k \to \infty$, hence
$$x_k \approx (\lambda_1)^k \alpha_1 u_1$$
but also
$$x_{k+1} \approx (\lambda_1)^{k+1} \alpha_1 u_1 \implies x_{k+1} \approx \lambda_1 x_k$$
We can then approximate $\lambda_1$ as the ratio of corresponding entries of the vectors $x_{k+1}$ and $x_k$, i.e.,
$$\lambda_1 \approx \frac{(x_{k+1})_j}{(x_k)_j}$$
We define the approximated eigenvector as
$$u_{\text{approx}} = \frac{x_k}{(\lambda_1)^k \alpha_1}$$
and hence the error becomes the part of the power iteration solution that was neglected, i.e.,
$$e = u_{\text{approx}} - u_1 = \left(\frac{\lambda_2}{\lambda_1}\right)^k \frac{\alpha_2}{\alpha_1} u_2 + \left[\left(\frac{\lambda_3}{\lambda_1}\right)^k \frac{\alpha_3}{\alpha_1} u_3 + \dots\right]$$
and when $k$ is large, we can write (again, we are assuming that $|\lambda_1| > |\lambda_2| \ge |\lambda_3| \ge |\lambda_4| \ge \dots$)
$$e_k \approx \left(\frac{\lambda_2}{\lambda_1}\right)^k \frac{\alpha_2}{\alpha_1} u_2$$
And when we take the norm of the error
$$\|e_k\| \approx \left|\frac{\lambda_2}{\lambda_1}\right|^k \left|\frac{\alpha_2}{\alpha_1}\right| \|u_2\| \implies \|e_k\| = O\left(\left|\frac{\lambda_2}{\lambda_1}\right|^k\right)$$
We want to see what happens to the error from one iteration of power iteration to the next:
$$\frac{\|e_{k+1}\|}{\|e_k\|} = \frac{\left|\frac{\lambda_2}{\lambda_1}\right|^{k+1}\left|\frac{\alpha_2}{\alpha_1}\right|}{\left|\frac{\lambda_2}{\lambda_1}\right|^{k}\left|\frac{\alpha_2}{\alpha_1}\right|} = \left|\frac{\lambda_2}{\lambda_1}\right|$$
In other words, the error decreases by a constant factor, given by $\left|\frac{\lambda_2}{\lambda_1}\right|$, at each iteration.
Power method has LINEAR convergence!
$$\|e_{k+1}\| = \left|\frac{\lambda_2}{\lambda_1}\right| \|e_k\|$$
or we can also write $\|e_k\| = \left|\frac{\lambda_2}{\lambda_1}\right|^k \|e_0\|$.
Suppose you are given a matrix with eigenvalues:
[3,4,5]
You use normalized power iteration to approximate one of the eigenvectors, which is given as $x$, and we assume $\|x\| = 1$.
Suppose the norm of the error of the initial guess is
$$\|e_0\| = \|x - x_0\| = 0.3$$
How big will the error be after three iterations of power iteration? (Since all vectors have norm 1, the absolute and relative errors are the same.)
$$\|e_3\| = \left|\frac{4}{5}\right|^3 \|e_0\| = 0.3\left(\frac{4}{5}\right)^3 = 0.1536$$
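We can sanity-check this arithmetic directly:
# error after three iterations: (4/5)^3 * ||e0||, should print (approximately) 0.1536
print(0.3*(4/5)**3)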
n=4
lambda_array_ordered = [7, 3, -2, 1]
X = np.random.rand(n,n)
U,_ = sla.qr(X)
D = np.diag(lambda_array_ordered)
A = U@D@U.T
eigl, eigv = la.eig(A)
eig_index_sort = np.argsort(abs(eigl))
eigpos = eig_index_sort[-1]
u1_exact = eigv[:,eigpos]
print('Largest lambda = ', lambda_array_ordered[0])
print('Eigenvector = ', u1_exact)
print('Convergence rate = ', lambda_array_ordered[1]/lambda_array_ordered[0])
# Generate normalized initial guess
x0 = np.random.random(n)
x = x0/la.norm(x0)
count = 0
diff = 1
eigs = [x[0]]
error = [np.abs( eigs[-1] - lambda_array_ordered[0] )]
# We will use as stopping criteria the change in the
# approximation for the eigenvalue
while (diff > 1e-6 and count < 100):
    count += 1
    xnew = A@x  # x_{k+1} = A x_k
    eigs.append(xnew[0]/x[0])
    x = xnew/la.norm(xnew)
    diff = np.abs( eigs[-1] - eigs[-2] )
    error.append( np.abs( eigs[-1] - lambda_array_ordered[0] ) )
    print("% 10f % 2e % 2f" %(eigs[-1], error[-1], error[-1]/error[-2]))
plt.semilogy(np.abs(error))
What if we are interested in the smallest eigenvalue in magnitude?
Suppose $x, \lambda$ is an eigenpair of $A$, such that $Ax = \lambda x$. What would be an eigenvalue of $A^{-1}$?
$$A^{-1}Ax = A^{-1}\lambda x$$
$$Ix = \lambda A^{-1}x$$
$$\frac{1}{\lambda}x = A^{-1}x$$
Hence $\frac{1}{\lambda}$ is an eigenvalue of $A^{-1}$.
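We can verify this relation numerically, as a quick check using the matrix A from the cells above:
# The eigenvalues of inv(A) should be the reciprocals of the eigenvalues of A
evals_A = la.eig(A)[0].real
evals_Ainv = la.eig(la.inv(A))[0].real
print(np.sort(1.0/evals_A))
print(np.sort(evals_Ainv))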
If we want to find the smallest eigenvalue in magnitude of $A$, we can perform power iteration using the matrix $A^{-1}$ to find $\bar{\lambda} = \frac{1}{\lambda_n}$, where $\bar{\lambda}$ is the largest eigenvalue (in magnitude) of $A^{-1}$ and $\lambda_n$ is the smallest eigenvalue (in magnitude) of $A$.
Let's implement that:
n = 4
D = np.diag([5,-1,2,7])
A = U@D@U.T
eigl, eigv = la.eig(A)
eig_index_sort = np.argsort(abs(eigl))
eig_index_sort
eigpos = eig_index_sort[0]
print(eigv[:,eigpos])
x0 = np.random.random(n)
nrm = la.norm(x0)
x = x0/nrm
for i in range(20):
    x = la.inv(A)@x
    x = x/la.norm(x)
print("lambdan = ",x.T@A@x/(x.T@x))
print("un = ", x)
Can you find ways to improve the code snippet above?
#Inverse Power iteration to get smallest eigenvalue
x0 = np.random.random(n)
nrm = la.norm(x0)
x = x0/nrm
P, L, Um = sla.lu(A)
for k in range(20):
    y = sla.solve_triangular(L, np.dot(P.T, x), lower=True)
    x = sla.solve_triangular(Um, y)
    x = x/la.norm(x)
print("lambdan = ",x.T@A@x/(x.T@x))
print("un = ", x)
What if we want to find another eigenvalue that is not the largest or the smallest?
Suppose $x, \lambda$ is an eigenpair of $A$, such that $Ax = \lambda x$. We want to find the eigenvalues of the shifted inverse matrix $(A - \sigma I)^{-1}$:
$$(A - \sigma I)^{-1}x = \bar{\lambda}x$$
$$Ix = \bar{\lambda}(A - \sigma I)x = \bar{\lambda}(\lambda I - \sigma I)x$$
$$\bar{\lambda} = \frac{1}{\lambda - \sigma}$$
We could write the above eigenvalue problem as
$$B^{-1}x = \bar{\lambda}x$$
where $B = A - \sigma I$, which can be solved by inverse power iteration. It will converge to the largest eigenvalue $\bar{\lambda} = \frac{1}{\lambda - \sigma}$, i.e., to the eigenvalue $\lambda$ of $A$ closest to the shift $\sigma$.
n = 4
D = np.diag([5,-7,2,10])
A = U@D@U.T
eigl, eigv = la.eig(A)
print(eigl)
eigv
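Before running the iteration, we can check the shifted-inverse relation numerically (a quick check, using the same shift σ = 1 as in the code below):
# Eigenvalues of inv(A - sigma*I) should equal 1/(lambda - sigma)
sigma = 1.0
print(np.sort(1.0/(eigl.real - sigma)))
print(np.sort(la.eig(la.inv(A - sigma*np.eye(n)))[0].real))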
#Shifted Inverse Power iteration
sigma = 1
x0 = np.random.random(n)
nrm = la.norm(x0)
x = x0/nrm
B = A - sigma*np.eye(n)
P, L, Um = sla.lu(B)
for k in range(20):
y = sla.solve_triangular(L, np.dot(P.T, x), lower=True)
x = sla.solve_triangular(Um, y)
x = x/la.norm(x)
print("lambdan = ",x.T@A@x/(x.T@x))
print("un = ", x)